[rocky9_6] History Rebuild for kernel-5.14.0-570.12.1.el9_6, kernel-5.14.0-570.16.1.el9_6, kernel-5.14.0-570.17.1.el9_6 #279
This is a direct copy of the history rather than a series of cherry-picks, to avoid missing any commits.
jira NONE_AUTOMATION cve CVE-2024-56631 Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Suraj Sonawane <[email protected]> commit f10593a Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/f10593ad.failed Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN: BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30 kernel/locking/lockdep.c:5838 __mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912 sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407 In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is called before releasing the open_rel_lock mutex. The kref_put() call may decrement the reference count of sfp to zero, triggering its cleanup through sg_remove_sfp(). This cleanup includes scheduling deferred work via sg_remove_sfp_usercontext(), which ultimately frees sfp. After kref_put(), sg_release() continues to unlock open_rel_lock and may reference sfp or sdp. If sfp has already been freed, this results in a slab-use-after-free error. Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the open_rel_lock mutex. This ensures: - No references to sfp or sdp occur after the reference count is decremented. - Cleanup functions such as sg_remove_sfp() and sg_remove_sfp_usercontext() can safely execute without impacting the mutex handling in sg_release(). The fix has been tested and validated by syzbot. This patch closes the bug reported at the following syzkaller link and ensures proper sequencing of resource cleanup and mutex operations, eliminating the risk of use-after-free errors in sg_release(). 
Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b Tested-by: [email protected] Fixes: cc833ac ("sg: O_EXCL and other lock handling") Signed-off-by: Suraj Sonawane <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit f10593a) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # drivers/scsi/sg.c
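The ordering described in the sg_release() commit above (unlock the mutex first, drop the last reference afterwards) can be sketched in userspace C. This is only an illustration: `demo_dev`, `demo_fd`, `demo_fd_put()` and `demo_release()` are hypothetical stand-ins, not the sg driver's code, and the plain integer refcount assumes a single thread, whereas the kernel uses an atomic kref.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the objects used by sg_release(). */
struct demo_dev {
    pthread_mutex_t open_rel_lock;  /* plays the role of open_rel_lock */
    int open_cnt;
};

struct demo_fd {
    struct demo_dev *parent;
    int refcount;                   /* plays the role of sfp->f_ref (not atomic here) */
    int *freed_flag;                /* set to 1 when the object is released */
};

/* Drop one reference; free the object when the count reaches zero. */
void demo_fd_put(struct demo_fd *fd)
{
    if (--fd->refcount == 0) {
        *fd->freed_flag = 1;
        free(fd);
    }
}

/* Release path with the corrected ordering: unlock first, put last. */
int demo_release(struct demo_fd *fd)
{
    struct demo_dev *dev = fd->parent;  /* read while fd is still valid */

    pthread_mutex_lock(&dev->open_rel_lock);
    dev->open_cnt--;
    pthread_mutex_unlock(&dev->open_rel_lock);

    /* Nothing after this call may touch fd: it may already be freed. */
    demo_fd_put(fd);
    return 0;
}
```

The point of the sketch is that `demo_fd_put()` is the final statement that dereferences `fd`; putting it before the unlock would recreate the window the commit closes.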
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Kai Mäkisara <[email protected]> commit 98b3788 Commit 9604eea ("scsi: st: Add third party poweron reset handling") in v6.6 added new code to handle the Power On/Reset Unit Attention (POR UA) sense data. This was in addition to the existing method. When this Unit Attention is received, the driver blocks attempts to read, write and some other operations because the reset may have rewinded the tape. Because of the added code, also the initial POR UA resulted in blocking operations, including those that are used to set the driver options after the device is recognized. Also, reading and writing are refused, whereas they succeeded before this commit. Add code to not set pos_unknown to block operations if the POR UA is received from the first test_ready() call after the st device has been created. This restores the behavior before v6.6. Signed-off-by: Kai Mäkisara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 9604eea ("scsi: st: Add third party poweron reset handling") CC: [email protected] Closes: https://lore.kernel.org/linux-scsi/[email protected]/ Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit 98b3788) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Joshua Hay <[email protected]> commit 52c11d3 On initial driver load, alloc_etherdev_mqs is called with whatever max queue values are provided by the control plane. However, if the driver is loaded on a system where num_online_cpus() returns less than the max queues, the netdev will think there are more queues than are actually available. Only num_online_cpus() will be allocated, but skb_get_queue_mapping(skb) could possibly return an index beyond the range of allocated queues. Consequently, the packet is silently dropped and it appears as if TX is broken. Set the real number of queues during open so the netdev knows how many queues will be allocated. Fixes: 1c325aa ("idpf: configure resources for TX queues") Signed-off-by: Joshua Hay <[email protected]> Reviewed-by: Madhu Chittim <[email protected]> Tested-by: Samuel Salin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> (cherry picked from commit 52c11d3) Signed-off-by: Jonathan Maple <[email protected]>
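The idpf commit above hinges on the gap between the queue count the netdev advertises and the number of queues actually allocated. A minimal sketch of that relationship follows, assuming the allocated count is the smaller of the control-plane maximum and the online CPU count. Both helper names are hypothetical and are not part of the idpf driver.

```c
/* Hypothetical helper: number of queues that are actually backed by allocations. */
unsigned int demo_real_num_queues(unsigned int max_queues, unsigned int online_cpus)
{
    return max_queues < online_cpus ? max_queues : online_cpus;
}

/* Returns 1 if a queue index picked by the stack maps to an allocated queue. */
int demo_queue_index_valid(unsigned int index, unsigned int real_queues)
{
    return index < real_queues;
}
```

With 16 advertised queues and 4 online CPUs, an index of 7 is out of range unless the netdev is told that only 4 queues are real, which is the failure mode the commit text describes.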
jira NONE_AUTOMATION cve CVE-2023-52922 Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author YueHaibing <[email protected]> commit 55c3b96 BUG: KASAN: slab-use-after-free in bcm_proc_show+0x969/0xa80 Read of size 8 at addr ffff888155846230 by task cat/7862 CPU: 1 PID: 7862 Comm: cat Not tainted 6.5.0-rc1-00153-gc8746099c197 #230 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xd5/0x150 print_report+0xc1/0x5e0 kasan_report+0xba/0xf0 bcm_proc_show+0x969/0xa80 seq_read_iter+0x4f6/0x1260 seq_read+0x165/0x210 proc_reg_read+0x227/0x300 vfs_read+0x1d5/0x8d0 ksys_read+0x11e/0x240 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd Allocated by task 7846: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x9e/0xa0 bcm_sendmsg+0x264b/0x44e0 sock_sendmsg+0xda/0x180 ____sys_sendmsg+0x735/0x920 ___sys_sendmsg+0x11d/0x1b0 __sys_sendmsg+0xfa/0x1d0 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 7846: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x27/0x40 ____kasan_slab_free+0x161/0x1c0 slab_free_freelist_hook+0x119/0x220 __kmem_cache_free+0xb4/0x2e0 rcu_core+0x809/0x1bd0 bcm_op is freed before procfs entry be removed in bcm_release(), this lead to bcm_proc_show() may read the freed bcm_op. Fixes: ffd980f ("[CAN]: Add broadcast manager (bcm) protocol") Signed-off-by: YueHaibing <[email protected]> Reviewed-by: Oliver Hartkopp <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/all/[email protected] Cc: [email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> (cherry picked from commit 55c3b96) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Calvin Owens <[email protected]> commit c79a39d Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-5.14.0-570.12.1.el9_6/c79a39dc.failed On a board running ntpd and gpsd, I'm seeing a consistent use-after-free in sys_exit() from gpsd when rebooting: pps pps1: removed ------------[ cut here ]------------ kobject: '(null)' (00000000db4bec24): is not initialized, yet kobject_put() is being called. WARNING: CPU: 2 PID: 440 at lib/kobject.c:734 kobject_put+0x120/0x150 CPU: 2 UID: 299 PID: 440 Comm: gpsd Not tainted 6.11.0-rc6-00308-gb31c44928842 #1 Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT) pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : kobject_put+0x120/0x150 lr : kobject_put+0x120/0x150 sp : ffffffc0803d3ae0 x29: ffffffc0803d3ae0 x28: ffffff8042dc9738 x27: 0000000000000001 x26: 0000000000000000 x25: ffffff8042dc9040 x24: ffffff8042dc9440 x23: ffffff80402a4620 x22: ffffff8042ef4bd0 x21: ffffff80405cb600 x20: 000000000008001b x19: ffffff8040b3b6e0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 696e6920746f6e20 x14: 7369203a29343263 x13: 205d303434542020 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: kobject_put+0x120/0x150 cdev_put+0x20/0x3c __fput+0x2c4/0x2d8 ____fput+0x1c/0x38 task_work_run+0x70/0xfc do_exit+0x2a0/0x924 do_group_exit+0x34/0x90 get_signal+0x7fc/0x8c0 do_signal+0x128/0x13b4 do_notify_resume+0xdc/0x160 el0_svc+0xd4/0xf8 el0t_64_sync_handler+0x140/0x14c el0t_64_sync+0x190/0x194 ---[ end trace 0000000000000000 ]--- ...followed by more symptoms of corruption, 
with similar stacks: refcount_t: underflow; use-after-free. kernel BUG at lib/list_debug.c:62! Kernel panic - not syncing: Oops - BUG: Fatal exception This happens because pps_device_destruct() frees the pps_device with the embedded cdev immediately after calling cdev_del(), but, as the comment above cdev_del() notes, fops for previously opened cdevs are still callable even after cdev_del() returns. I think this bug has always been there: I can't explain why it suddenly started happening every time I reboot this particular board. In commit d953e0e ("pps: Fix a use-after free bug when unregistering a source."), George Spelvin suggested removing the embedded cdev. That seems like the simplest way to fix this, so I've implemented his suggestion, using __register_chrdev() with pps_idr becoming the source of truth for which minor corresponds to which device. But now that pps_idr defines userspace visibility instead of cdev_add(), we need to be sure the pps->dev refcount can't reach zero while userspace can still find it again. So, the idr_remove() call moves to pps_unregister_cdev(), and pps_idr now holds a reference to pps->dev. pps_core: source serial1 got cdev (251:1) <...> pps pps1: removed pps_core: unregistering pps1 pps_core: deallocating pps1 Fixes: d953e0e ("pps: Fix a use-after free bug when unregistering a source.") Cc: [email protected] Signed-off-by: Calvin Owens <[email protected]> Reviewed-by: Michal Schmidt <[email protected]> Link: https://lore.kernel.org/r/a17975fd5ae99385791929e563f72564edbcf28f.1731383727.git.calvin@wbinvd.org Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit c79a39d) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # drivers/ptp/ptp_ocp.c
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Anumula Murali Mohan Reddy <[email protected]> commit 356983f t4_set_vf_mac_acl() uses the PF to set the MAC address, but t4vf_get_vf_mac_acl() uses the port number to get it. This leads to an error when attempting to set the MAC address on VFs of PF2 and PF3. This patch fixes the issue by using the port number to set the MAC address. Fixes: e0cdac6 ("cxgb4vf: configure ports accessible by the VF") Signed-off-by: Anumula Murali Mohan Reddy <[email protected]> Signed-off-by: Potnuri Bharat Teja <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit 356983f) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Sourabh Jain <[email protected]> commit 0bdd7ff Commit 683eab9 ("powerpc/fadump: setup additional parameters for dump capture kernel") introduced the additional parameter feature in fadump for HASH MMU with the understanding that GRUB does not use the memory area between 640MB and 768MB for its operation. However, the third patch ("powerpc: increase MIN RMA size for CAS negotiation") in this series is changing the MIN RMA size to 768MB, allowing GRUB to use memory up to 768MB. This makes the fadump reservation for the additional parameter feature for HASH MMU unreliable. To address this, export the MIN_RMA so that the next patch ("powerpc/fadump: fix additional param memory reservation for HASH MMU") can identify the correct memory range for the additional parameter feature in fadump for HASH MMU. Reviewed-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Sourabh Jain <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit 0bdd7ff) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Sourabh Jain <[email protected]> commit b7bb460 Commit 683eab9 ("powerpc/fadump: setup additional parameters for dump capture kernel") introduced the additional parameter feature in fadump for HASH MMU with the understanding that GRUB does not use the memory area between 640MB and 768MB for its operation. However, the third patch in this series ("powerpc: increase MIN RMA size for CAS negotiation") changes the MIN RMA size to 768MB, allowing GRUB to use memory up to 768MB. This makes the fadump reservation for the additional parameter feature for HASH MMU unreliable. To address this, adjust the memory range for the additional parameter in fadump for HASH MMU. This will ensure that GRUB does not overwrite the memory reserved for fadump's additional parameter in HASH MMU. The new policy for the memory range for the additional parameter in HASH MMU is that the first memory block must be larger than the MIN_RMA size, as the bootloader can use memory up to the MIN_RMA size. The range should be between MIN_RMA and the RMA size (ppc64_rma_size), and it must not overlap with the fadump reserved area. Reviewed-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Sourabh Jain <[email protected]> Reviewed-by: Hari Bathini <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit b7bb460) Signed-off-by: Jonathan Maple <[email protected]>
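The placement policy stated in the commit above (the range lies between MIN_RMA and the RMA size and must not overlap the fadump reserved area) can be expressed as a small search. The function below is a hypothetical sketch of that policy, not the kernel's implementation: `demo_find_param_area()`, its parameters, and the choice to try only two candidate positions are assumptions made for illustration.

```c
#include <stdint.h>

#define DEMO_MIN_RMA (768ULL << 20)  /* 768 MB, the MIN RMA size named in the commit text */

/*
 * Hypothetical search: return the start of a `size`-byte range inside
 * [DEMO_MIN_RMA, rma_size) that avoids [resv_start, resv_end), or 0 if
 * no such range exists.
 */
uint64_t demo_find_param_area(uint64_t rma_size, uint64_t size,
                              uint64_t resv_start, uint64_t resv_end)
{
    const uint64_t lo = DEMO_MIN_RMA;

    /* The first memory block must extend past MIN_RMA far enough to fit. */
    if (size == 0 || rma_size <= lo || size > rma_size - lo)
        return 0;

    /* Candidate 1: start at MIN_RMA if that misses the reservation. */
    if (lo + size <= resv_start || lo >= resv_end)
        return lo;

    /* Candidate 2: start right after the reservation, if it still fits. */
    if (resv_end < rma_size && size <= rma_size - resv_end)
        return resv_end;

    return 0;
}
```

A return value of 0 stands for "no usable range", which matches the case where the first memory block is not larger than MIN_RMA.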
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Avnish Chouhan <[email protected]> commit fdc4453 Change RMA size from 512 MB to 768 MB which will result in more RMA at boot time for PowerPC. When PowerPC LPAR use/uses vTPM, Secure Boot or FADump, the 512 MB RMA memory is not sufficient for booting. With this 512 MB RMA, GRUB2 run out of memory and unable to load the necessary. Sometimes even usage of CDROM which requires more memory for installation along with the options mentioned above troubles the boot memory and result in boot failures. Increasing the RMA size will resolves multiple out of memory issues observed in PowerPC. Failure details: 1. GRUB2 kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 1 kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime space kern/ieee1275/init.c:550: mm requested region of size 8513000, flags 0 kern/ieee1275/init.c:563: Cannot satisfy allocation and retain minimum runtime space kern/file.c:215: Closing `/ppc/ppc64/initrd.img' ... kern/disk.c:297: Closing `ieee1275//vdevice/v-scsi @30000067/disk8300000000000000'... kern/disk.c:311: Closing `ieee1275//vdevice/v-scsi @30000067/disk8300000000000000' succeeded. kern/file.c:225: Closing `/ppc/ppc64/initrd.img' failed with 3. kern/file.c:148: Opening `/ppc/ppc64/initrd.img' succeeded. error: ../../grub-core/kern/mm.c:552:out of memory. 2. 
Kernel [ 0.777633] List of all partitions: [ 0.777639] No filesystem could mount root, tried: [ 0.777640] [ 0.777649] Kernel panic - not syncing: VFS: Unable to mount root fs on "" or unknown-block(0,0) [ 0.777658] CPU: 17 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-0.rc4.20.el10.ppc64le #1 [ 0.777669] Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.B0 (VL950_149) hv:phyp pSeries [ 0.777678] Call Trace: [ 0.777682] [c000000003db7b60] [c000000001119714] dump_stack_lvl+0x88/0xc4 (unreliable) [ 0.777700] [c000000003db7b90] [c00000000016c274] panic+0x174/0x460 [ 0.777711] [c000000003db7c30] [c00000000200631c] mount_root_generic+0x320/0x354 [ 0.777724] [c000000003db7d00] [c0000000020066f8] prepare_namespace+0x27c/0x2f4 [ 0.777735] [c000000003db7d90] [c000000002005824] kernel_init_freeable+0x254/0x294 [ 0.777747] [c000000003db7df0] [c00000000001131c] kernel_init+0x30/0x1c4 [ 0.777757] [c000000003db7e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c [ 0.777768] --- interrupt: 0 at 0x0 [ 0.784238] pstore: backend (nvram) writing error (-1) [ 0.790447] Rebooting in 10 seconds.. Signed-off-by: Avnish Chouhan <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit fdc4453) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Sourabh Jain <[email protected]> commit 61c403b Update the fadump document to include details about the fadump additional parameter feature. The document includes the following: - Significance of the feature - How to use it - Feature restrictions No functional changes are introduced. Signed-off-by: Sourabh Jain <[email protected]> Reviewed-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Link: https://patch.msgid.link/[email protected] (cherry picked from commit 61c403b) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Yang Shi <[email protected]> commit 56a7087 Commit ba0fb44 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") and subsequent patches changed how zone_dma_limit is calculated to allow a reduced ZONE_DMA even when RAM starts above 4GB. Commit 122c234 ("arm64: mm: keep low RAM dma zone") further fixed this to ensure ZONE_DMA remains below U32_MAX if RAM starts below 4GB, especially on platforms that do not have IORT or DT description of the device DMA ranges. While zone boundaries calculation was fixed by the latter commit, zone_dma_limit, used to determine the GFP_DMA flag in the core code, was not updated. This results in excessive use of GFP_DMA and unnecessary ZONE_DMA allocations on some platforms. Update zone_dma_limit to match the actual upper bound of ZONE_DMA. Fixes: ba0fb44 ("dma-mapping: replace zone_dma_bits by zone_dma_limit") Cc: <[email protected]> # 6.12.x Reported-by: Yutang Jiang <[email protected]> Tested-by: Yutang Jiang <[email protected]> Signed-off-by: Yang Shi <[email protected]> Link: https://lore.kernel.org/r/[email protected] [[email protected]: some tweaking of the commit log] Signed-off-by: Catalin Marinas <[email protected]> (cherry picked from commit 56a7087) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Yishai Hadas <[email protected]> commit abc7b3f Memory regions (MR) of type DM (device memory) do not have an associated umem. In the __mlx5_ib_dereg_mr() -> mlx5_free_priv_descs() flow, the code incorrectly takes the wrong branch, attempting to call dma_unmap_single() on a DMA address that is not mapped. This results in a WARN [1], as shown below. The issue is resolved by properly accounting for the DM type and ensuring the correct branch is selected in mlx5_free_priv_descs(). [1] WARNING: CPU: 12 PID: 1346 at drivers/iommu/dma-iommu.c:1230 iommu_dma_unmap_page+0x79/0x90 Modules linked in: ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry ovelay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core CPU: 12 UID: 0 PID: 1346 Comm: ibv_rc_pingpong Not tainted 6.12.0-rc7+ #1631 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:iommu_dma_unmap_page+0x79/0x90 Code: 2b 49 3b 29 72 26 49 3b 69 08 73 20 4d 89 f0 44 89 e9 4c 89 e2 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 07 b8 88 ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 0f 1f 44 00 RSP: 0018:ffffc90001913a10 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810194b0a8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001 RBP: ffff88810194b0a8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f537abdd740(0000) GS:ffff88885fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f537aeb8000 
CR3: 000000010c248001 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn+0x84/0x190 ? iommu_dma_unmap_page+0x79/0x90 ? report_bug+0xf8/0x1c0 ? handle_bug+0x55/0x90 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? iommu_dma_unmap_page+0x79/0x90 dma_unmap_page_attrs+0xe6/0x290 mlx5_free_priv_descs+0xb0/0xe0 [mlx5_ib] __mlx5_ib_dereg_mr+0x37e/0x520 [mlx5_ib] ? _raw_spin_unlock_irq+0x24/0x40 ? wait_for_completion+0xfe/0x130 ? rdma_restrack_put+0x63/0xe0 [ib_core] ib_dereg_mr_user+0x5f/0x120 [ib_core] ? lock_release+0xc6/0x280 destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs] uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs] uobj_destroy+0x3f/0x70 [ib_uverbs] ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs] ? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs] ? lock_acquire+0xc1/0x2f0 ? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] ? ib_uverbs_ioctl+0x116/0x170 [ib_uverbs] ? lock_release+0xc6/0x280 ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs] ? 
ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs] __x64_sys_ioctl+0x1b0/0xa70 do_syscall_64+0x6b/0x140 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f537adaf17b Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffff218f0b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffff218f1d8 RCX: 00007f537adaf17b RDX: 00007ffff218f1c0 RSI: 00000000c0181b01 RDI: 0000000000000003 RBP: 00007ffff218f1a0 R08: 00007f537aa8d010 R09: 0000561ee2e4f270 R10: 00007f537aace3a8 R11: 0000000000000246 R12: 00007ffff218f190 R13: 000000000000001c R14: 0000561ee2e4d7c0 R15: 00007ffff218f450 </TASK> Fixes: f18ec42 ("RDMA/mlx5: Use a union inside mlx5_ib_mr") Signed-off-by: Yishai Hadas <[email protected]> Link: https://patch.msgid.link/2039c22cfc3df02378747ba4d623a558b53fc263.1738587076.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]> (cherry picked from commit abc7b3f) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION cve CVE-2025-21694 Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Rik van Riel <[email protected]> commit cbc5dde Since commit 5cbcb62 ("fs/proc: fix softlockup in __read_vmcore") the number of softlockups in __read_vmcore at kdump time have gone down, but they still happen sometimes. In a memory constrained environment like the kdump image, a softlockup is not just a harmless message, but it can interfere with things like RCU freeing memory, causing the crashdump to get stuck. The second loop in __read_vmcore has a lot more opportunities for natural sleep points, like scheduling out while waiting for a data write to happen, but apparently that is not always enough. Add a cond_resched() to the second loop in __read_vmcore to (hopefully) get rid of the softlockups. Link: https://lkml.kernel.org/r/20250110102821.2a37581b@fangorn Fixes: 5cbcb62 ("fs/proc: fix softlockup in __read_vmcore") Signed-off-by: Rik van Riel <[email protected]> Reported-by: Breno Leitao <[email protected]> Cc: Baoquan He <[email protected]> Cc: Dave Young <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> (cherry picked from commit cbc5dde) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Mike Christie <[email protected]> commit 8604f63 scsi_check_passthrough() is always called, but it doesn't check for if a command completed successfully. As a result, if a command was successful and the caller used SCMD_FAILURE_RESULT_ANY to indicate what failures it wanted to retry, we will end up retrying the command. This will cause delays during device discovery because of the command being sent multiple times. For some USB devices it can also cause the wrong device size to be used. This patch adds a check for if the command was successful. If it is we return immediately instead of trying to match a failure. Fixes: 994724e ("scsi: core: Allow passthrough to request midlayer retries") Reported-by: Kris Karas <[email protected]> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219652 Signed-off-by: Mike Christie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Reviewed-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit 8604f63) Signed-off-by: Jonathan Maple <[email protected]>
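The fix described above amounts to an early return for successful commands before any failure matching takes place. A minimal sketch of that control flow, with hypothetical names (`demo_failure`, `demo_should_retry()`, and `DEMO_FAILURE_RESULT_ANY` as a stand-in for SCMD_FAILURE_RESULT_ANY), and assuming a result of 0 means success:

```c
#define DEMO_FAILURE_RESULT_ANY (-1)  /* stand-in for SCMD_FAILURE_RESULT_ANY */

/* Hypothetical failure descriptor: which result the caller wants retried. */
struct demo_failure {
    int result;  /* result to match, or DEMO_FAILURE_RESULT_ANY */
};

/* Returns 1 if the command should be retried, 0 otherwise. */
int demo_should_retry(int cmd_result, const struct demo_failure *failures, int n)
{
    int i;

    /* A successful command must never match a failure entry. */
    if (cmd_result == 0)
        return 0;

    for (i = 0; i < n; i++) {
        if (failures[i].result == DEMO_FAILURE_RESULT_ANY ||
            failures[i].result == cmd_result)
            return 1;
    }
    return 0;
}
```

Without the early return, a wildcard entry would match a result of 0 and the successful command would be resent, which is the delay the commit reports.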
jira NONE_AUTOMATION cve CVE-2024-57807 Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Tomas Henzl <[email protected]> commit 50740f4

This fixes a 'possible circular locking dependency detected' warning

       CPU0                    CPU1
       ----                    ----
  lock(&instance->reset_mutex);
                               lock(&shost->scan_mutex);
                               lock(&instance->reset_mutex);
  lock(&shost->scan_mutex);

Fix this by temporarily releasing the reset_mutex.

Signed-off-by: Tomas Henzl <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Chandrakanth Patil <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit 50740f4) Signed-off-by: Jonathan Maple <[email protected]>
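The approach in the commit above, temporarily releasing reset_mutex so that scan_mutex is never acquired while it is held, can be sketched with two pthread mutexes. The structure and function names here are hypothetical and do not come from the megaraid_sas driver. Any state protected by the released lock would need to be revalidated after it is reacquired, and the sketch does not show that step.

```c
#include <pthread.h>

/* Hypothetical host object carrying the two locks from the lockdep report. */
struct demo_host {
    pthread_mutex_t reset_mutex;
    pthread_mutex_t scan_mutex;
    int scans_done;
};

/* The scan path takes scan_mutex on its own. */
void demo_scan(struct demo_host *h)
{
    pthread_mutex_lock(&h->scan_mutex);
    h->scans_done++;
    pthread_mutex_unlock(&h->scan_mutex);
}

/*
 * Called with reset_mutex held; returns with reset_mutex held.
 * reset_mutex is dropped around the scan so the two locks are never nested
 * in the reset -> scan order.
 */
void demo_rescan_from_reset_path(struct demo_host *h)
{
    pthread_mutex_unlock(&h->reset_mutex);
    demo_scan(h);
    pthread_mutex_lock(&h->reset_mutex);
}
```

Because the reset path no longer holds reset_mutex while waiting for scan_mutex, the reset-then-scan nesting on CPU0 in the report cannot occur.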
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Paulo Alcantara <[email protected]> commit 654292a When the user sets a file or directory as read-only (e.g. ~S_IWUGO), the client will set the ATTR_READONLY attribute by sending an SMB2_SET_INFO request to the server in cifs_setattr_{,nounix}(), but cifsInodeInfo::cifsAttrs will be left unchanged as the client will only update the new file attributes in the next call to {smb311_posix,cifs}_get_inode_info() with the new metadata filled in @data parameter. Commit a18280e ("smb: cilent: set reparse mount points as automounts") mistakenly removed the @data NULL check when calling is_inode_cache_good(), which broke the above case as the new ATTR_READONLY attribute would end up not being updated on files with a read lease. Fix this by updating the inode whenever we have cached metadata in @data parameter. Reported-by: Horst Reiterer <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Fixes: a18280e ("smb: cilent: set reparse mount points as automounts") Cc: [email protected] Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]> Signed-off-by: Steve French <[email protected]> (cherry picked from commit 654292a) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author BH Hsieh <[email protected]> commit 55f1a5f Observed VBUS_OVERRIDE & ID_OVERRIDE might be programmed with unexpected value prior to XUSB PADCTL driver, this could also occur in virtualization scenario. For example, UEFI firmware programs ID_OVERRIDE=GROUNDED to set a type-c port to host mode and keeps the value to kernel. If the type-c port is connected a usb host, below errors can be observed right after usb host mode driver gets probed. The errors would keep until usb role class driver detects the type-c port as device mode and notifies usb device mode driver to set both ID_OVERRIDE and VBUS_OVERRIDE to correct value by XUSB PADCTL driver. [ 173.765814] usb usb3-port2: Cannot enable. Maybe the USB cable is bad? [ 173.765837] usb usb3-port2: config error Taking virtualization into account, asserting XUSB PADCTL reset would break XUSB functions used by other guest OS, hence only reset VBUS & ID OVERRIDE of the port in utmi_phy_init. Fixes: bbf7116 ("phy: tegra: xusb: Add Tegra186 support") Cc: [email protected] Change-Id: Ic63058d4d49b4a1f8f9ab313196e20ad131cc591 Signed-off-by: BH Hsieh <[email protected]> Signed-off-by: Henry Lin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]> (cherry picked from commit 55f1a5f) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Jeff Layton <[email protected]> commit b9382e2 nfsd_file_dispose_list_delayed can be called from the filecache laundrette, which is shut down after the nfsd threads are shut down and the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL then there are no threads to wake. Ensure that the nn->nfsd_serv pointer is non-NULL before calling svc_wake_up in nfsd_file_dispose_list_delayed. This is safe since the svc_serv is not freed until after the filecache laundrette is cancelled. Reported-by: Salvatore Bonaccorso <[email protected]> Closes: https://bugs.debian.org/1093734 Fixes: ffb4025 ("nfsd: Don't leave work of closing files to a work queue") Cc: [email protected] Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]> (cherry picked from commit b9382e2) Signed-off-by: Jonathan Maple <[email protected]>
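The guard described in the nfsd commit above is a NULL check on the service pointer before waking a thread. A minimal sketch with hypothetical types (`demo_serv`, `demo_net`) and a counter standing in for the wake-up call:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the service object and the per-net state. */
struct demo_serv {
    int wakeups;  /* counts wake-up requests instead of waking real threads */
};

struct demo_net {
    struct demo_serv *serv;  /* NULL once the service threads are shut down */
};

/* Returns 1 if a wake-up was issued, 0 if there was no service to wake. */
int demo_dispose_list_delayed(struct demo_net *nn)
{
    struct demo_serv *serv = nn->serv;

    /* After shutdown there are no threads left, so there is nothing to wake. */
    if (serv == NULL)
        return 0;

    serv->wakeups++;
    return 1;
}
```

The sketch reads the pointer once into a local before testing it, so the check and the use refer to the same value.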
jira NONE_AUTOMATION cve CVE-2024-56623 Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Quinn Tran <[email protected]> commit 07c903d System crash is observed with stack trace warning of use after free. There are 2 signals to tell dpc_thread to terminate (UNLOADING flag and kthread_stop). On setting the UNLOADING flag when dpc_thread happens to run at the time and sees the flag, this causes dpc_thread to exit and clean up itself. When kthread_stop is called for final cleanup, this causes use after free. Remove UNLOADING signal to terminate dpc_thread. Use the kthread_stop as the main signal to exit dpc_thread. [596663.812935] kernel BUG at mm/slub.c:294! [596663.812950] invalid opcode: 0000 [#1] SMP PTI [596663.812957] CPU: 13 PID: 1475935 Comm: rmmod Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x86_64 #1 [596663.812960] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/20/2012 [596663.812974] RIP: 0010:__slab_free+0x17d/0x360 ... [596663.813008] Call Trace: [596663.813022] ? __dentry_kill+0x121/0x170 [596663.813030] ? _cond_resched+0x15/0x30 [596663.813034] ? _cond_resched+0x15/0x30 [596663.813039] ? wait_for_completion+0x35/0x190 [596663.813048] ? try_to_wake_up+0x63/0x540 [596663.813055] free_task+0x5a/0x60 [596663.813061] kthread_stop+0xf3/0x100 [596663.813103] qla2x00_remove_one+0x284/0x440 [qla2xxx] Cc: [email protected] Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Nilesh Javali <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> (cherry picked from commit 07c903d) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 03ff378 When function evict_should_delete() returns SHOULD_DEFER_EVICTION, gh is never initialized, but that isn't obvious; if it did initialize gh and then return SHOULD_DEFER_EVICTION, gfs2_evict_inode() would fail to release it. To clarify the code, change gfs2_evict_inode() to always check if gh needs to be released, no matter what evict_should_delete() returns. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 03ff378) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 5788253 Add a number of glock flags that are currently not shown in the text form of glock tracepoints. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 5788253) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit f83f897 Glocks are always actively acquired by processes, but as indicated by the GL_NOPID holder flag, some of them are then associated with objects like cached inodes rather than the process that acquired them. As such, for those glock holders, it makes little sense to dump which processes originally acquired them. Therefore, gfs2 is trying to hide the identity of the processes that acquired those glocks. The code for doing that is incorrect though, so fix it. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit f83f897) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 8bbfde0 Introduce a new GLF_PENDING_REPLY flag to indicate that a reply from DLM is expected. Include that flag in glock dumps to show more clearly what's going on. (When the GLF_PENDING_REPLY flag is set, the GLF_LOCK flag will also be set but the GLF_LOCK flag alone isn't sufficient to tell that we are waiting for a DLM reply.) Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 8bbfde0) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 3774f53 Having this flag attached to the iopen glock instead of the inode is much simpler; it eliminates a potential weird race in gfs2_try_evict(). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 3774f53) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 0b93bac The last user of this flag was removed in commit b77b4a4 ("gfs2: Rework freeze / thaw logic"). Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 0b93bac) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Su Hui <[email protected]> commit bb25b97 clang static analyzer complains that value stored to 'gh' is never read. The code of this line is useless after commit 0b93bac ("gfs2: Remove LM_FLAG_PRIORITY flag"). Remove this code to save space. Signed-off-by: Su Hui <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit bb25b97) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit 0360fac Remove some more dead code in add_to_queue() that commit 0b93bac ("gfs2: Remove LM_FLAG_PRIORITY flag") has rendered obsolete. This is a continuation of commit 3302764610057 ("gfs2: remove dead code in add_to_queue"); no functional change. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit 0360fac) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit d838605 In run_queue(), check if the queue of pending requests is empty instead of blindly assuming that it won't be. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit d838605) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.12.1.el9_6 commit-author Andreas Gruenbacher <[email protected]> commit a431d49 In finish_xmote(), when a locking request is canceled, the corresponding holder is moved to the tail of the holders list instead of being dequeued immediately. When there is only a single holder, the canceled locking request is then immediately repeated. This makes no sense; it looks like another remnant of LM_FLAG_PRIORITY support. Instead, dequeue canceled holders and proceed with the next holder in finish_xmote(). We can then easily detect in gfs2_glock_dq() when a holder has been canceled. Signed-off-by: Andreas Gruenbacher <[email protected]> (cherry picked from commit a431d49) Signed-off-by: Jonathan Maple <[email protected]>
…l per operation jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit a040c35 Since commit ff0ce72 ("cgroup/cpuset: Eliminate unncessary sched domains rebuilds in hotplug"), there is only one rebuild_sched_domains_locked() call per hotplug operation. However, writing to the various cpuset control files may still cause more than one rebuild_sched_domains_locked() call to happen in some cases. Juri had found that two rebuild_sched_domains_locked() calls in update_prstate(), one from update_cpumasks_hier() and another one from update_partition_sd_lb() could cause cpuset partition to be created with null total_bw for DL tasks. IOW, DL tasks may not be scheduled correctly in such a partition. A sample command sequence that can reproduce null total_bw is as follows. # echo Y >/sys/kernel/debug/sched/verbose # echo +cpuset >/sys/fs/cgroup/cgroup.subtree_control # mkdir /sys/fs/cgroup/test # echo 0-7 > /sys/fs/cgroup/test/cpuset.cpus # echo 6-7 > /sys/fs/cgroup/test/cpuset.cpus.exclusive # echo root >/sys/fs/cgroup/test/cpuset.cpus.partition Fix this double rebuild_sched_domains_locked() calls problem by replacing existing calls with cpuset_force_rebuild() except the rebuild_sched_domains_cpuslocked() call at the end of cpuset_handle_hotplug(). Checking of the force_sd_rebuild flag is now done at the end of cpuset_write_resmask() and update_prstate() to determine if rebuild_sched_domains_locked() should be called or not. The cpuset v1 code can still call rebuild_sched_domains_locked() directly as double rebuild_sched_domains_locked() calls are not possible. Reported-by: Juri Lelli <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Waiman Long <[email protected]> Tested-by: Juri Lelli <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit a040c35) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit 9b496a8 Isolated CPUs are not allowed to be used in a non-isolated partition. The only exception is the top cpuset which is allowed to contain boot time isolated CPUs. Commit ccac8e8 ("cgroup/cpuset: Fix remote root partition creation problem") introduces a simplified scheme of including only partition roots in sched domain generation. However, it does not properly account for this exception case. This can result in leakage of isolated CPUs into a sched domain. Fix it by making sure that isolated CPUs are excluded from the top cpuset before generating sched domains. Also update the way the boot time isolated CPUs are handled in test_cpuset_prs.sh to make sure that those isolated CPUs are really isolated instead of just skipping them in the tests. Fixes: ccac8e8 ("cgroup/cpuset: Fix remote root partition creation problem") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit 9b496a8) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit a22b3d5 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/a22b3d54.failed There is a possible race between removing a cgroup directory that is a partition root and the creation of a new partition. The partition to be removed can be dying but still online, it does not currently participate in checking for exclusive CPUs conflict, but the exclusive CPUs are still there in subpartitions_cpus and isolated_cpus. These two cpumasks are global states that affect the operation of cpuset partitions. The exclusive CPUs in dying cpusets will only be removed when cpuset_css_offline() function is called after an RCU delay. As a result, it is possible that a new partition can be created with exclusive CPUs that overlap with those of a dying one. When that dying partition is finally offlined, it removes those overlapping exclusive CPUs from subpartitions_cpus and maybe isolated_cpus resulting in an incorrect CPU configuration. This bug was found when a warning was triggered in remote_partition_disable() during testing because the subpartitions_cpus mask was empty. One possible way to fix this is to iterate the dying cpusets as well and avoid using the exclusive CPUs in those dying cpusets. However, this can still cause random partition creation failures or other anomalies due to racing. A better way to fix this race is to reset the partition state at the moment when a cpuset is being killed. Introduce a new css_killed() CSS function pointer and call it, if defined, before setting CSS_DYING flag in kill_css(). Also update the css_is_dying() helper to use the CSS_DYING flag introduced by commit 33c35aa ("cgroup: Prevent kill_css() from being called more than once") for proper synchronization.
Add a new cpuset_css_killed() function to reset the partition state of a valid partition root if it is being killed. Fixes: ee8dde0 ("cpuset: Add new v2 cpuset.sched.partition flag") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit a22b3d5) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # kernel/cgroup/cpuset.c
…fective_cpumask() jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit 668e041 Before commit f0af1bf ("cgroup/cpuset: Relax constraints to partition & cpus changes"), a cpuset partition cannot be enabled if not all the requested CPUs can be granted from the parent cpuset. After that commit, a cpuset partition can be created even if the requested exclusive CPUs contain CPUs not allowed in its parent. The delmask containing exclusive CPUs to be removed from its parent wasn't adjusted accordingly. That is not a problem until the introduction of a new isolated_cpus mask in commit 11e5f40 ("cgroup/cpuset: Keep track of CPUs in isolated partitions") as the CPUs in the delmask may be added directly into isolated_cpus. As a result, isolated_cpus may incorrectly contain CPUs that are not isolated leading to incorrect data reporting. Fix this by adjusting the delmask to reflect the actual exclusive CPUs for the creation of the partition. Fixes: 11e5f40 ("cgroup/cpuset: Keep track of CPUs in isolated partitions") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit 668e041) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit 8bf450f When remote_partition_disable() is called to disable a remote partition, it always sets the partition to an invalid partition state. It should only do so if an error code (prs_err) has been set. Correct that and add proper error code in places where remote_partition_disable() is called due to error. Fixes: 181c8e0 ("cgroup/cpuset: Introduce remote partition") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit 8bf450f) Signed-off-by: Jonathan Maple <[email protected]>
…_hier() handle remote partition jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit f62a5d3 Currently, changes in exclusive CPUs are being handled in remote_partition_check() by disabling conflicting remote partitions. However, that may lead to results unexpected by the users. Fix this problem by removing remote_partition_check() and making update_cpumasks_hier() handle changes in descendant remote partitions properly. The compute_effective_exclusive_cpumask() function is enhanced to check the exclusive_cpus and effective_xcpus from siblings and exclude them in its effective exclusive CPUs computation and return a value to show if there are any sibling conflicts. This is somewhat like the cpu_exclusive flag check in validate_change(). This is the initial step to enable us to retire the use of cpu_exclusive flag in cgroup v2 in the future. One of the tests in the TEST_MATRIX of the test_cpuset_prs.sh script has to be updated due to changes in the way a child remote partition root is being handled (updated instead of invalidation) in update_cpumasks_hier(). Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit f62a5d3) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit f0a0bd3 Rename partition_xcpus_newstate() to isolated_cpus_update(), update_partition_exclusive() to update_partition_exclusive_flag() and the new_xcpus_state variable to isolcpus_updated to make their meanings more explicit. Also add some comments to further clarify the code. No functional change is expected. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit f0a0bd3) Signed-off-by: Jonathan Maple <[email protected]>
… and state separator jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit 65046b5 Currently, ',' is used as the cgroup separator of the expected effective CPUs and partition root states in the test matrix. However, ',' can be part of the output of the cpuset.cpus*.effective and cpuset.cpus.isolated files. Change the separator to '|' so that ',' can appear as part of the expected values. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit 65046b5) Signed-off-by: Jonathan Maple <[email protected]>
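A minimal Python sketch of why the separator change above matters. The entry format (`name:cpulist` items joined by a separator), the cgroup names, and the CPU lists are assumptions made for illustration; they are not copied from test_cpuset_prs.sh.

```python
def parse_expected(field: str, sep: str) -> dict[str, str]:
    """Split 'name:value' entries joined by `sep` into a dict."""
    result = {}
    for entry in field.split(sep):
        name, _, value = entry.partition(":")
        result[name] = value
    return result


# Hypothetical expected-value field: A1 should see CPUs 0-1 and 4, A2 CPUs 2-3.
# With '|' as the cgroup separator, the ',' inside a CPU list survives intact.
with_pipe = parse_expected("A1:0-1,4|A2:2-3", sep="|")

# With ',' doing double duty, the CPU list of A1 is cut in two and a bogus
# entry named "4" appears.
with_comma = parse_expected("A1:0-1,4,A2:2-3", sep=",")

print(with_pipe)
print(with_comma)
```

The point is only that a separator which can also occur inside a value makes the field ambiguous to split.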
jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit b2b2b4d Clean up the test_cpuset_prs.sh script and restructure some of the functions so that a new test matrix with a different cgroup directory structure can be added in the next patch. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit b2b2b4d) Signed-off-by: Jonathan Maple <[email protected]>
…t_prs.sh jira NONE_AUTOMATION Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Waiman Long <[email protected]> commit e8a457b The current cgroup directory layout for running the partition state transition tests is mainly suitable for testing local partitions as well as with a mix of local and remote partitions. It is not that suitable for doing extensive remote partition and nested remote/local partition testing. Add a new set of remote partition tests REMOTE_TEST_MATRIX with another cgroup directory structure more tailored for remote partition testing to provide better code coverage. Also add a few new test cases as well as adjusting existing ones for the original TEST_MATRIX. Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Tejun Heo <[email protected]> (cherry picked from commit e8a457b) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION cve CVE-2025-37749 Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Arnaud Lecomte <[email protected]> commit aabc659 Ensure we have enough data in linear buffer from skb before accessing initial bytes. This prevents potential out-of-bounds accesses when processing short packets. When ppp_sync_txmung receives an incoming package with an empty payload: (remote) gef➤ p *(struct pppoe_hdr *) (skb->head + skb->network_header) $18 = { type = 0x1, ver = 0x1, code = 0x0, sid = 0x2, length = 0x0, tag = 0xffff8880371cdb96 } from the skb struct (trimmed) tail = 0x16, end = 0x140, head = 0xffff88803346f400 "4", data = 0xffff88803346f416 ":\377", truesize = 0x380, len = 0x0, data_len = 0x0, mac_len = 0xe, hdr_len = 0x0, it is not safe to access data[2]. Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=29fc8991b0ecb186cf40 Tested-by: [email protected] Fixes: 1da177e ("Linux-2.6.12-rc2") Signed-off-by: Arnaud Lecomte <[email protected]> Link: https://patch.msgid.link/20250408-bound-checking-ppp_txmung-v2-1-94bb6e1b92d0@arnaud-lcm.com [[email protected]: fixed subj typo] Signed-off-by: Paolo Abeni <[email protected]> (cherry picked from commit aabc659) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION cve CVE-2025-21756 Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Michal Luczaj <[email protected]> commit 135ffc7 vsock defines a BPF callback to be invoked when close() is called. However, this callback is never actually executed. As a result, a closed vsock socket is not automatically removed from the sockmap/sockhash. Introduce a dummy vsock_close() and make vsock_release() call proto::close. Note: changes in __vsock_release() look messy, but it's only due to indent level reduction and variables xmas tree reorder. Fixes: 634f1a7 ("vsock: support sockmap") Signed-off-by: Michal Luczaj <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Reviewed-by: Luigi Leonardi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> (cherry picked from commit 135ffc7) Signed-off-by: Jonathan Maple <[email protected]>
jira NONE_AUTOMATION cve CVE-2025-21756 Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Michal Luczaj <[email protected]> commit fcdd224 Preserve sockets bindings; this includes both resulting from an explicit bind() and those implicitly bound through autobind during connect(). Prevents socket unbinding during a transport reassignment, which fixes a use-after-free: 1. vsock_create() (refcnt=1) calls vsock_insert_unbound() (refcnt=2) 2. transport->release() calls vsock_remove_bound() without checking if sk was bound and moved to bound list (refcnt=1) 3. vsock_bind() assumes sk is in unbound list and before __vsock_insert_bound(vsock_bound_sockets()) calls __vsock_remove_bound() which does: list_del_init(&vsk->bound_table); // nop sock_put(&vsk->sk); // refcnt=0 BUG: KASAN: slab-use-after-free in __vsock_bind+0x62e/0x730 Read of size 4 at addr ffff88816b46a74c by task a.out/2057 dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 __vsock_bind+0x62e/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Allocated by task 2057: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 __vsock_create.constprop.0+0x2e/0xb60 vsock_create+0xe4/0x420 __sock_create+0x241/0x650 __sys_socket+0xf2/0x1a0 __x64_sys_socket+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 2057: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 __vsock_bind+0x5e1/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e refcount_t: addition on 0; use-after-free. 
WARNING: CPU: 7 PID: 2057 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 RIP: 0010:refcount_warn_saturate+0xce/0x150 __vsock_bind+0x66d/0x730 vsock_bind+0x97/0xe0 __sys_bind+0x154/0x1f0 __x64_sys_bind+0x6e/0xb0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e refcount_t: underflow; use-after-free. WARNING: CPU: 7 PID: 2057 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 RIP: 0010:refcount_warn_saturate+0xee/0x150 vsock_remove_bound+0x187/0x1e0 __vsock_release+0x383/0x4a0 vsock_release+0x90/0x120 __sock_release+0xa3/0x250 sock_close+0x14/0x20 __fput+0x359/0xa80 task_work_run+0x107/0x1d0 do_exit+0x847/0x2560 do_group_exit+0xb8/0x250 __x64_sys_exit_group+0x3a/0x50 x64_sys_call+0xfec/0x14f0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: c0cfa2d ("vsock: add multi-transports support") Reviewed-by: Stefano Garzarella <[email protected]> Signed-off-by: Michal Luczaj <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit fcdd224) Signed-off-by: Jonathan Maple <[email protected]>
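The three-step refcount sequence in the commit message above can be traced with a toy Python model. This is only a sketch of the reference counting as the message describes it: `ToySock` and `bind_after_reassignment` are hypothetical names, not kernel code, and the "binding kept" branch simply skips step 2 rather than reproducing the actual patch.

```python
class ToySock:
    """Toy stand-in for a socket: just a reference count and a freed flag."""

    def __init__(self) -> None:
        self.refcnt = 1          # reference held by the creator
        self.freed = False

    def hold(self) -> None:
        self.refcnt += 1

    def put(self) -> None:
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True    # in the kernel this is where sk is destroyed


def bind_after_reassignment(unbind_on_release: bool) -> ToySock:
    sk = ToySock()               # 1. vsock_create(): refcnt=1
    sk.hold()                    #    vsock_insert_unbound(): refcnt=2
    if unbind_on_release:
        sk.put()                 # 2. transport->release() drops the list ref: refcnt=1
    sk.put()                     # 3. vsock_bind() -> __vsock_remove_bound(): sock_put()
    if not sk.freed:
        sk.hold()                #    __vsock_insert_bound() takes its own reference
    return sk


buggy = bind_after_reassignment(unbind_on_release=True)
kept = bind_after_reassignment(unbind_on_release=False)
print(buggy.refcnt, buggy.freed)
print(kept.refcnt, kept.freed)
```

In the first call the count reaches zero inside step 3, which corresponds to the use-after-free KASAN reports in __vsock_bind; in the second call the binding reference is still there when vsock_bind() runs, so the socket survives.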
jira NONE_AUTOMATION cve CVE-2025-21756 Rebuild_History Non-Buildable kernel-5.14.0-570.17.1.el9_6 commit-author Michal Luczaj <[email protected]> commit 78dafe1 During socket release, sock_orphan() is called without considering that it sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a null pointer dereference in virtio_transport_wait_close(). Orphan the socket only after transport release. Partially reverts the 'Fixes:' commit. KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] lock_acquire+0x19e/0x500 _raw_spin_lock_irqsave+0x47/0x70 add_wait_queue+0x46/0x230 virtio_transport_release+0x4e7/0x7f0 __vsock_release+0xfd/0x490 vsock_release+0x90/0x120 __sock_release+0xa3/0x250 sock_close+0x14/0x20 __fput+0x35e/0xa90 __x64_sys_close+0x78/0xd0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c Fixes: fcdd224 ("vsock: Keep the binding until socket destruction") Tested-by: Luigi Leonardi <[email protected]> Reviewed-by: Luigi Leonardi <[email protected]> Signed-off-by: Michal Luczaj <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> (cherry picked from commit 78dafe1) Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50% Number of commits in upstream range v5.14~1..kernel-mainline: 296506 Number of commits in rpm: 40 Number of commits matched with upstream: 37 (92.50%) Number of commits in upstream but not in rpm: 296469 Number of commits NOT found in upstream: 3 (7.50%) Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.17.1.el9_6 for kernel-5.14.0-570.17.1.el9_6 Clean Cherry Picks: 35 (94.59%) Empty Cherry Picks: 2 (5.41%) _______________________________ Full Details Located here: ciq/ciq_backports/kernel-5.14.0-570.17.1.el9_6/rebuild.details.txt Includes: * git commit header above * Empty Commits with upstream SHA * RPM ChangeLog Entries that could not be matched Individual Empty Commit failures contained in the same containing directory. The git message for empty commits will have the path for the failed commit. File names are the first 8 characters of the upstream SHA
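The percentages in the rebuild summary above can be cross-checked with a short Python sketch. The counts are taken from the summary; the helper names `pct` and `failed_path`, and the zero padding used to make a full-length SHA, are assumptions for illustration and are not part of the rebuild tooling.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals."""
    return f"{part / whole:.2%}"


def failed_path(kernel_tag: str, upstream_sha: str) -> str:
    """Path of a failed cherry-pick: first 8 characters of the upstream SHA."""
    return f"ciq/ciq_backports/{kernel_tag}/{upstream_sha[:8]}.failed"


# Counts from the kernel-5.14.0-570.17.1.el9_6 summary.
upstream_range = 296506
rpm_commits = 40
matched = 37
clean, empty = 35, 2

summary = {
    "matched": pct(matched, rpm_commits),                  # of commits in the rpm
    "not_found": pct(rpm_commits - matched, rpm_commits),  # of commits in the rpm
    "clean": pct(clean, matched),                          # of matched commits
    "empty": pct(empty, matched),                          # of matched commits
    "upstream_only": upstream_range - matched,
}
print(summary)

# Placeholder SHA: the 8-character prefix appears in this PR, the padding is made up.
example = failed_path("kernel-5.14.0-570.17.1.el9_6", "a22b3d54" + "0" * 32)
print(example)
```

Note that the matched and not-found percentages are relative to the 40 commits in the rpm, while the clean and empty cherry-pick percentages are relative to the 37 matched commits.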
The CI jobs are failing because they can't find the configs.
Yeahhh they are there but not listed in git status:
So let me look into that.
I think they weren't committed. This doesn't show any config files: https://github.com/ctrliq/kernel-src-tree/tree/54ddec9cf3a439884c0dbc9e3712737438bcdb13/configs
But it worked for 8_10, which used the exact same process, which is why I need to look into it.
The configs are messed up here as well, so I'm closing all this and deleting content to debug.
jira LE-1907 cve {CVE-2024-33621 cve RHEL-44402] cve [RHEL-44404 cve Liu) cve (Hangbin cve 4,6_outbound Rebuild_History Non-Buildable kernel-5.14.0-427.31.1.el9_4 commit-author Yue Haibing <[email protected]> commit b3dc6e8 Raw packet from PF_PACKET socket ontop of an IPv6-backed ipvlan device will hit WARN_ON_ONCE() in sk_mc_loop() through sch_direct_xmit() path. WARNING: CPU: 2 PID: 0 at net/core/sock.c:775 sk_mc_loop+0x2d/0x70 Modules linked in: sch_netem ipvlan rfkill cirrus drm_shmem_helper sg drm_kms_helper CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Not tainted 6.9.0+ ctrliq#279 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:sk_mc_loop+0x2d/0x70 Code: fa 0f 1f 44 00 00 65 0f b7 15 f7 96 a3 4f 31 c0 66 85 d2 75 26 48 85 ff 74 1c RSP: 0018:ffffa9584015cd78 EFLAGS: 00010212 RAX: 0000000000000011 RBX: ffff91e585793e00 RCX: 0000000002c6a001 RDX: 0000000000000000 RSI: 0000000000000040 RDI: ffff91e589c0f000 RBP: ffff91e5855bd100 R08: 0000000000000000 R09: 3d00545216f43d00 R10: ffff91e584fdcc50 R11: 00000060dd8616f4 R12: ffff91e58132d000 R13: ffff91e584fdcc68 R14: ffff91e5869ce800 R15: ffff91e589c0f000 FS: 0000000000000000(0000) GS:ffff91e898100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f788f7c44c0 CR3: 0000000008e1a000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? __warn (kernel/panic.c:693) ? sk_mc_loop (net/core/sock.c:760) ? report_bug (lib/bug.c:201 lib/bug.c:219) ? handle_bug (arch/x86/kernel/traps.c:239) ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) ? sk_mc_loop (net/core/sock.c:760) ip6_finish_output2 (net/ipv6/ip6_output.c:83 (discriminator 1)) ? nf_hook_slow (net/netfilter/core.c:626) ip6_finish_output (net/ipv6/ip6_output.c:222) ? 
__pfx_ip6_finish_output (net/ipv6/ip6_output.c:215) ipvlan_xmit_mode_l3 (drivers/net/ipvlan/ipvlan_core.c:602) ipvlan ipvlan_start_xmit (drivers/net/ipvlan/ipvlan_main.c:226) ipvlan dev_hard_start_xmit (net/core/dev.c:3594) sch_direct_xmit (net/sched/sch_generic.c:343) __qdisc_run (net/sched/sch_generic.c:416) net_tx_action (net/core/dev.c:5286) handle_softirqs (kernel/softirq.c:555) __irq_exit_rcu (kernel/softirq.c:589) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043) The warning triggers as this: packet_sendmsg packet_snd //skb->sk is packet sk __dev_queue_xmit __dev_xmit_skb //q->enqueue is not NULL __qdisc_run sch_direct_xmit dev_hard_start_xmit ipvlan_start_xmit ipvlan_xmit_mode_l3 //l3 mode ipvlan_process_outbound //vepa flag ipvlan_process_v6_outbound ip6_local_out __ip6_finish_output ip6_finish_output2 //multicast packet sk_mc_loop //sk->sk_family is AF_PACKET Call ip{6}_local_out() with NULL sk in ipvlan as other tunnels to fix this. Fixes: 2ad7bf3 ("ipvlan: Initial check-in of the IPVLAN driver.") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Yue Haibing <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]> (cherry picked from commit b3dc6e8) Signed-off-by: Jonathan Maple <[email protected]>
General Process:
src.rpm
5.14.0-570
git cherry-pick
rpmbuild -bp
from corresponding src.rpm.
Notes
There is no jira for this yet as we're working on early automation.
Checking Rebuild Commits for potentially Missing Commits:
resf_kernel-5.14.0-570.12.1.el9_6
The FIRST tag has A LOT of "Number of commits NOT found in upstream: 1858 (97.74%)", which can be reduced down to a bunch of entries to the kabi list (this may be a result of kernel-ark style updates).
resf_kernel-5.14.0-570.16.1.el9_6
resf_kernel-5.14.0-570.17.1.el9_6
KselfTest
This is the first build of Rocky 9.6 and some of the kernel selftests have changes. Specifically, pidfd now stalls and hangs and does not exit. This may need to be investigated, but lkdtm still doesn't work; see this about it: https://github.com/ctrliq/kernel-src-tree/tree/main/kselftests/lkdtm